NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Learning useful representations for shifting tasks and distributions

Zhang, Jianyu; Bottou, Léon (July 2023, ICML 2023)

Does the dominant approach to learn representations (as a side effect of optimizing an expected cost for a single training distribution) remain a good approach when we are dealing with multiple distributions? Our thesis is that such scenarios are better served by representations that are richer than those obtained with a single optimization episode. We support this thesis with simple theoretical arguments and with experiments utilizing an apparently na\"ıve ensembling technique: concatenating the representations obtained from multiple training episodes using the same data, model, algorithm, and hyper-parameters, but different random seeds. These independently trained networks perform similarly. Yet, in a number of scenarios involving new distributions, the concatenated representation performs substantially better than an equivalently sized network trained with a single training run. This proves that the representations constructed by multiple training episodes are in fact different. Although their concatenation carries little additional information about the training task under the training distribution, it becomes substantially more informative when tasks or distributions change. Meanwhile, a single training episode is unlikely to yield such a redundant representation because the optimization process has no reason to accumulate features that do not incrementally improve the training performance.
more » « less
Full Text Available
Rich Feature Construction for the Optimization-Generalization Dilemma

https://doi.org/10.48550/arXiv.2203.15516

Zhang, Jianyu; Lopez-Paz, David; Bottou, Léon (July 2022, International Conference on Machine Learning)

There often is a dilemma between ease of optimization and robust out-of-distribution (OoD) generalization. For instance, many OoD methods rely on penalty terms whose optimization is challenging. They are either too strong to optimize reliably or too weak to achieve their goals. We propose to initialize the networks with a rich representation containing a palette of potentially useful features, ready to be used by even simple models. On the one hand, a rich representation provides a good initialization for the optimizer. On the other hand, it also provides an inductive bias that helps OoD generalization. Such a representation is constructed with the Rich Feature Construction (RFC) algorithm, also called the Bonsai algorithm, which consists of a succession of training episodes. During discovery episodes, we craft a multi-objective optimization criterion and its associated datasets in a manner that prevents the network from using the features constructed in the previous iterations. During synthesis episodes, we use knowledge distillation to force the network to simultaneously represent all the previously discovered features. Initializing the networks with Bonsai representations consistently helps six OoD methods achieve top performance on ColoredMNIST benchmark. The same technique substantially outperforms comparable results on the Wilds Camelyon17 task, eliminates the high result variance that plagues other methods, and makes hyperparameter tuning and model selection more reliable.
more » « less
Full Text Available
Chiral Spin-Wave Velocities Induced by All-Garnet Interfacial Dzyaloshinskii-Moriya Interaction in Ultrathin Yttrium Iron Garnet Films

https://doi.org/10.1103/PhysRevLett.124.027203

Wang, Hanchen; Chen, Jilei; Liu, Tao; Zhang, Jianyu; Baumgaertl, Korbinian; Guo, Chenyang; Li, Yuehui; Liu, Chuanpu; Che, Ping; Tu, Sa; et al (January 2020, Physical Review Letters)

Full Text Available
Excitation of unidirectional exchange spin waves by a nanoscale magnetic grating

https://doi.org/10.1103/PhysRevB.100.104427

Chen, Jilei; Yu, Tao; Liu, Chuanpu; Liu, Tao; Madami, Marco; Shen, Ka; Zhang, Jianyu; Tu, Sa; Alam, Md Shah; Xia, Ke; et al (September 2019, Physical Review B)

Full Text Available
Current-controlled propagation of spin waves in antiparallel, coupled domains

https://doi.org/10.1038/s41565-019-0429-7

Liu, Chuanpu; Wu, Shizhe; Zhang, Jianyu; Chen, Jilei; Ding, Jinjun; Ma, Ji; Zhang, Yuelin; Sun, Yuanwei; Tu, Sa; Wang, Hanchen; et al (July 2019, Nature Nanotechnology)

Full Text Available
Measurement of exclusive $$J/\psi$$ and $$\psi(2S)$$ production at $$\sqrt{s}=13$$ TeV

https://doi.org/10.21468/SciPostPhys.18.2.071

Collaboration, LHCb; Aaij, Roel; Abdelmotteleb, Ahmed_Sameh Wagih; Abellan_Beteta, Carlos; Abudinèn, Fernando Jesus; Ackernley, Thomas; Adefisoye, Ayomide Matthew; Adeva, Bernardo; Adinolfi, Marco; Adlarson, Patrik Harri; et al (January 2025, SciPost Physics)

Measurements are presented of the cross-section for the central exclusive production ofJ/\psi\to\mu^+\mu^- $J / ψ \to μ^{+} μ^{-}$ and\psi(2S)\to\mu^+\mu^- $ψ (2 S) \to μ^{+} μ^{-}$ processes in proton-proton collisions at\sqrt{s} = 13 \ \mathrm{TeV} $\sqrt{s} = 13 T e V$ with 2016–2018 data. They are performed by requiring both muons to be in the LHCb acceptance (with pseudorapidity2<\eta_{\mu^±} < 4.5 $2 < η_{μ^{\pm}} < 4.5$ ) and mesons in the rapidity range2.0 < y < 4.5 $2.0 < y < 4.5$ . The integrated cross-section results are\sigma_{J/\psi\to\mu^+\mu^-}(2.0 $σ_{J / ψ \to μ^{+} μ^{-}} (2.0 < y_{J / ψ} < 4.5, 2.0 < η_{μ^{\pm}} < 4.5) = 400 \pm 2 \pm 5 \pm 12 p b, σ_{ψ (2 S) \to μ^{+} μ^{-}} (2.0 < y_{ψ (2 S)} < 4.5, 2.0 < η_{μ^{\pm}} < 4.5) = 9.40 \pm 0.15 \pm 0.13 \pm 0.27 p b,$ where the uncertainties are statistical, systematic and due to the luminosity determination. In addition, a measurement of the ratio of\psi(2S) $ψ (2 S)$ andJ/\psi $J / ψ$ cross-sections, at an average photon-proton centre-of-mass energy of1\ \mathrm{TeV} $1 T e V$ , is performed, giving$ = 0.1763 ± 0.0029 ± 0.0008 ± 0.0039,$$ where the first uncertainty is statistical, the second systematic and the third due to the knowledge of the involved branching fractions. For the first time, the dependence of theJ/\psi$ $J / ψ$ and\psi(2S) $ψ (2 S)$ cross-sections on the total transverse momentum transfer is determined inpp $p p$ collisions and is found consistent with the behaviour observed in electron-proton collisions.
more » « less
Full Text Available
Measurement of the absolute branching fraction of the inclusive decay ${\bar{Λ}}_{c}^{-} \to \bar{n} + X$

https://doi.org/10.1103/PhysRevD.108.L031101

Ablikim, M; Achasov, M N; Adlarson, P; Albrecht, M; Aliberti, R; Amoroso, A; An, M R; An, Q; Bai, X H; Bai, Y; et al (August 2023, Physical Review D)

Full Text Available

Search for: All records